Abstract. A common feature of biological networks is the geometric
property of self-similarity. Molecular regulatory networks through to cir- 
culatory systems, nervous systems, social systems and ecological trophic 
networks, show self-similar connectivity at multiple scales. We analyze 
the relationship between topology and signaling in contrasting classes of 
such topologies. We find that networks differ in their ability to contain 
or propagate signals between arbitrary nodes in a network depending 
on whether they possess branching or loop-like features. Networks also 
differ in how they respond to noise, such that one allows for greater 
integration at high noise, and this performance is reversed at low noise. 
Surprisingly, small-world topologies, with diameters logarithmic in sys- 
tem size, have slower dynamical timescales, and may be less integrated 
(more modular) than networks with longer path lengths. All of these 
phenomena are essentially mesoscopic, vanishing in the infinite limit but 
producing strong effects at sizes and timescales relevant to biology. 

Biological networks exhibit a wide range of structural features at mul- 
tiple spatial scales |lji3j. These include local circuitry reflecting the logic 
of regulation among small numbers of elements [1], and motifs of statis- 
tically over-represented patterns within larger networks of interactions [5], 
through to macroscopic properties of complete networks including the de- 
scription of the degree distributions and the large scale geometric features 
of networks [2]. Among the most interesting geometric properties of biological
networks is the property of self-similarity or scale invariance [HI E], in
which characteristic topological features are present at all scales from the 
local organization of individual nodes, through to aggregations at the largest 
network scales. 

For genetic and proteomic regulatory networks, as well as social net- 
works and a variety of distribution networks, including respiratory and cir- 
culatory networks, the mechanisms generating self-similar structures have 
been well explored [8-17]. A growing body of empirical work investigates

DYNAMICS AND PROCESSING IN FINITE SELF-SIMILAR NETWORKS

self-similar network structures, including motif overabundances at differ- 
ent coarse-grained scales. The topology of networks under coarse-graining 
(agglomeration) of nodes has formed a central focus in both empirical [H] 
and theoretical [18-24] work. However, the functional implications of these
topological properties remain poorly understood. 

Functional explanations of self-similarity tend to fall into one of three 
broad classes. Robustness explanations consider the connectivity properties 
under perturbation, and contrast, for example, scale-free and exponential 
degree distributions [25-27]. Adaptive optimization theories argue that
self-similarity provides an efficient means of provisioning densely distributed
resource sinks with a minimum of cable cost [28, 29]. Hence networks such as
the circulatory system can efficiently provide energy-rich compounds to the
cells of the body, and neural networks can efficiently integrate information
from a large variety of sensory inputs [30, 31].

Finally, neutral theories suggest that self-similarity is not in itself an opti- 
mized property of biological networks, but a consequence of highly conserved 
developmental processes with local rules of assembly that generate
characteristic macroscopic properties [32-35]. Mathematical studies have shown
how motif abundances can be the consequences of constraints on large-scale
topological properties [36]; conversely, large-scale topological features might
arise from constraints on a single local property [37]. In either case, the
connection between the large and small-scale properties of a network may 
have emerged first without functional meaning. 

In this contribution we investigate the functional implications of self- 
similar assembly, as the nodes of a system adjust their internal states in 
response to their neighbours and in the presence of environmental noise. 
We find a tension between the small-world properties of a network and 
the rapidity of the transition to an ordered phase. For a fixed number of 
vertices and links, self-similar networks with small-world properties tend
to show more gradual transitions, both dynamically, and as a function of 
noise. The nested-hub structure of such networks provides a bottleneck 
restricting the possible paths to distant parts of the network. By contrast, 
hierarchical assemblies, characterized by nesting and a more open structure, 
have a sharper transition to the ordered state. 

Our most surprising results show that while there is some advantage to 
small-diameter, small-world networks in the high-noise regime, a completely 
different architecture - that of nested networks, which eliminates bottlenecks 
at the expense of longer average paths - provides greater integration in the 
low-noise regime. Further, small-world networks produced by branching 
show dramatically longer dynamical timescales than their nested counter- 
parts. Both features of the nested architecture are driven by the presence 
of multiple paths between points, as we establish both by simulation and by 
analytic calculation of the graphical structures that underlie the problem. 

Figure 1. Branching (left) and nested (right) iterations on
a simple graph.

A central theme of our investigation is the difference between these
constructions in the mesoscopic regime: $N \gg 1$, but finite. As we shall see,
various properties that vanish in the infinite-size limit lead to pronounced
differences in behavior at the finite scales relevant to biology. Our use of 
both analytic and numerical techniques allows us to investigate two dis- 
tinct regimes relevant to this mesoscopic phase: analytic results describe 
the finite-size-infinite-time equilibrium, while numerical simulations show 
the finite-size-finite-time properties, relevant in the case of strong non- 
equilibrium effects. 

1. Constructing self-similar networks 

We first introduce a deterministic, algorithmic approach for construct- 
ing hierarchical, self-similar networks. Our methods use the notion of a 
construction template, or base motif, that provides the seed for self-similar 
construction. Alternative stochastic approaches include defining hierarchical
assemblies in terms of correlations in an otherwise random network [38],
through biases introduced into an ensemble [39], or through high-dimensional
generalizations of deterministic constructions [40, 41]. The pseudofractal [42]
and "flower" graphs are alternative deterministic constructions. An advantage
of the deterministic assemblies is that exact
calculations of critical behavior become possible.

The self-similar networks we describe take two forms, depending on their
assembly mechanisms - see Fig. 1. The assembly mechanism is stated formally
in Appendix A; it relies on the specification of (1) a motif pattern
M (in Fig. 1, for example, the triangle), and (2) a method $f$ of replacing
nodes in the pattern by new, "smaller scale," copies of the original motif. 
This method can then be iterated, deterministically, to produce networks of 
increasing size and complexity. 

Visually, our constructions possess fractal-like properties, with self-similarity 
upon coarse-graining. Our formal definition of the construction of these
networks amounts, in the reverse direction, to a specification of a
renormalization group transformation [43].

The two simplest choices of node replacement lead to two different kinds 
of network: a branching topology, characterized by the absence of large-scale 
loops, and a nested topology, where the loop structure of the base motif is 
replicated on all scales. We consider the scaling of average network diameter,
$\langle d \rangle$, the geodesic distance averaged over all distinct pairs of nodes.



1.1. Branching Assembly. As a network grows, a particular unit may 
preserve the same "local-structural" relationship at each level of the hierar- 
chy. For example, the central node of a star may be the central node of the 
network at all levels of iteration. These networks are characteristic of circu- 
latory and vascular networks, where each node, regardless of its position in 
a hierarchy, tends to perform the same function [44, 45].

In the formalism of Appendix A, such a mapping is provided when $f(i,j)$
is equal to $i$. Iterations increase inequality in the network, producing
degree distributions characterized by a motif scale, with an exponential tail
of nodes with a "runaway" influence on the rest of the system. Biological
networks with this property include the neural network of C. elegans [46]
and the small-world networks of Ref. [TT]. Exponential tails in the degree
distribution are also found in the original Erdős-Rényi random graph.

An illustration of a branching iteration on the triangle is shown on the 
left column of Fig. 1. As the order increases, loops, loops with free loops,
and so forth are produced; the highest vertex degree increases exponentially 
in the number of vertices - these are nodes on the largest "super-loop." All 
loops, or, in general, subgraphs, may be detached by a single cut. 



1.2. Nested Assembly. In contrast to branching iterations, a nested iter- 
ation is when the unit - vertex or subgraph - takes on the characteristics 
of its neighbors. A node is no longer restricted to a single local structure, 
but participates in structures at multiple spatial scales. This is common 
in communication and computational networks, characterized by extensive 
feedback loops and connectivity to topologically distinct regions. 




Figure 2. Two generations of nested assembly for a common
E. coli motif; the base graph M is shown in the upper-right
corner. The large-scale <l> pattern is topologically
equivalent to the base motif. 

In the formalism of Appendix A, this second mode of network assembly
is provided by $f(i,j)$ equal to $j$. While branching structures look
tree-like, nested networks are characterized by the replication of subgraph
loop structures on increasingly larger scales.
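The two assembly rules can be sketched in a few lines. Appendix A is not reproduced in this section, so the following is only one plausible reading of the rule $f(i,j)$, labeled here as an assumption: at each iteration the graph is copied once per motif vertex, and each motif edge $(i,j)$ links copy $i$ to copy $j$. In the branching case copy $i$ always attaches through its own corner $(i, i, \ldots, i)$; in the nested case it attaches through the corner that points at its neighbor, $(i, j, \ldots, j)$. The names `build`, `diameter`, and `TRIANGLE` are hypothetical. Under this reading, four iterations on the triangle give the 243 nodes and 363 bonds quoted in the caption of Fig. 5, and the branching network has the maximum diameter of 9 quoted in Sec. 3.1.

```python
from collections import deque

TRIANGLE = [(0, 1), (1, 2), (0, 2)]  # base motif M: three vertices, three edges


def build(motif_edges, n, generations, mode):
    """Return (nodes, edges) after `generations` iterations on the motif.

    Nodes are tuples of motif-vertex labels, largest scale first.
    mode="branching": copy i always attaches through its corner (i, i, ..., i).
    mode="nested":    copy i attaches to copy j through (i, j, ..., j).
    This is an assumed reading of f(i, j), not the authors' Appendix A.
    """
    nodes = [(v,) for v in range(n)]
    edges = {frozenset(((a,), (b,))) for a, b in motif_edges}
    for g in range(1, generations + 1):
        # One relabeled copy of the previous graph per motif vertex.
        new_nodes = [(c,) + x for c in range(n) for x in nodes]
        new_edges = {frozenset((c,) + x for x in e) for c in range(n) for e in edges}
        # Link the copies according to the motif pattern.
        for a, b in motif_edges:
            if mode == "branching":
                u, v = (a,) * (g + 1), (b,) * (g + 1)
            else:
                u, v = (a,) + (b,) * g, (b,) + (a,) * g
            new_edges.add(frozenset((u, v)))
        nodes, edges = new_nodes, new_edges
    return nodes, edges


def diameter(nodes, edges):
    """Maximum geodesic distance, by breadth-first search from every node."""
    adj = {v: [] for v in nodes}
    for e in edges:
        u, v = tuple(e)
        adj[u].append(v)
        adj[v].append(u)
    best = 0
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best


if __name__ == "__main__":
    for mode in ("branching", "nested"):
        nodes, edges = build(TRIANGLE, 3, 4, mode)
        print(mode, len(nodes), len(edges), diameter(nodes, edges))
```

Both modes add the same number of nodes and bonds per iteration; they differ only in where the new bonds attach, which is what separates the hub-dominated branching graphs from the loop-rich nested ones.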

Nested networks are shown in the right-hand column of Fig. 1; a more
complicated example is that of Fig. 2, where two iterations of a motif
overabundant in E. coli [48] are shown. As shown in Fig. 3, nested graphs have
larger diameters; they lack the "small-world" property of logarithmic scaling 
of diameter with size found through replication. 

1.3. Topological Properties. The two cases we have considered, pure 
branching or nesting of a motif pattern M, can be considered extremes 
of how a network might assemble. Branching networks tend to increase in- 
equalities in the degree distributions of vertices while keeping loops at an 
approximately constant density, whereas nested networks create many more 
loops while reproducing (almost) the degree distribution of the lower levels. 

Another difference between the assembly rules is the scaling of the average 
diameter. As can be seen in Fig. [3], branching, with its tree-like hierarchy 
of central hubs, produces small-world graphs where the network diameter
scales only as the logarithm of network size [3^ . 

Formally: they have logarithmic scaling of diameter, and no loops on scales above the
motif size.

Figure 3. "Small world" behavior in hierarchies. Shown is
the scaling of the average diameter, $\langle d \rangle$, with $N$, the number
of vertices, for the branching (solid line), nested (dashed
line), and mixed (intermediate, dotted line) hierarchies on
the triangle motif. Branching hierarchies have the small-world
property, $\langle d \rangle \sim \ln N$, while nested hierarchies scale as
a power law with index roughly that of the motif diameter:
$\langle d \rangle \sim N^{\langle d_m \rangle}$. Mixed hierarchies - shown here, those built of
alternating branching and nested iterations - also scale as a
power law, but at a slower rate than the nested case.

This can be understood by considering successive construction steps. The
increase in the number of vertices at each iteration leads to an exponential
scaling of system size with iteration number:

(1) $N_i - N_{i-1} = (n - 1) N_{i-1}$,

where $n$ is the number of vertices in the base motif and $N_i$ is the total
number of vertices at iteration $i$. The tree structure of branching networks,
however, means that the distance between nodes on the perimeter (i.e., those
nodes with the greatest separations on the graph) increases only by a
constant, proportional to $n$. The diameter, in other words, increases linearly at
each iteration, and so the network as a whole has only a logarithmic scaling 
between diameter and system size. 
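The step from linear growth per iteration to logarithmic scaling can be written out explicitly. The short derivation below assumes the construction starts from a single motif, so that $N_0 = n$, and writes $d_g$ for the diameter after $g$ iterations and $c$ for its constant increment per iteration; both symbols are introduced here for illustration only.

```latex
\begin{align}
N_g &= n\,N_{g-1} = n^{\,g+1}, \qquad g = \frac{\ln N_g}{\ln n} - 1, \\
d_g &= d_0 + c\,g = (d_0 - c) + \frac{c}{\ln n}\,\ln N_g \;\sim\; \ln N_g .
\end{align}
```

For the triangle, $n = 3$ and $g = 4$ give $N = 3^5 = 243$, the network size used in Fig. 5. The maximum branching diameter of 9 quoted in Sec. 3.1 would be consistent with $d_0 = 1$ and $c = 2$.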

Diameter increases much more rapidly for the nested structures. Crossing 
such a structure requires crossing the nested subgraphs, and so the
separation between distant points increases in proportion to $N_i$ as well as $n$. This
leads to a power-law scaling, with diameter increasing as a power of the
number of vertices. The index of the average diameter scaling is the average
diameter of the base motif. All of these relationships are shown in Fig. 3.

When two formation patterns are mixed so that a graph may switch from
branched to nested, the result is networks such as those in Fig. 4. As a means of
self-organization, mixed iterations lead to a kind of self-dissimilarity, where
coarse-graining reveals different organizational principles at different levels.
Evidence for self-dissimilarity under coarse-graining has been found in both
biological and engineered systems [19].

Figure 4. Mixed iterations on the three-vertex loop.
Branching then nesting (left); nesting then branching (right).
Subsequent iterations define connectivity on increasingly
larger scales.



2. Signaling, Modularity and Noise 

Whereas some network features and motif structures could arise through 
simple genetic or developmental stochastic processes, non-essential conser- 
vation rules, or due to the local constraints of physics and chemistry, we 
show that two extremes of network structure can still have important func- 
tional implications for the ways in which different parts of a network become 
correlated, or exchange information. 

A crucial concept for this work is that of noise, which accounts for the 
influence of random events and unobserved degrees of freedom in a system. 
Particular examples of noise might include the small-number fluctuations in 
reactants that affect metabolic processes, the coupling of observed neurons 
to part of the larger, unobserved network, or the use of mixed strategies in a 
game-theoretic system. In the absence of strong theories for the noise prop- 
erties of a particular case, we use a maximum-entropy model, as described 
below. 

In particular, for our dynamics, we take nodes to have two states ("on" 
or "off" ) approximating the discrete switching events observed in a number 
of systems from the cellular [50] to the social [51]. We follow recent work
showing the dominance of pairwise interactions in system behavior [52-54],
and consider networks with pairwise constraints described by a maximum 
entropy model. This amounts to requiring the full state of the system - 
the switch-state of all vertices - be given by the Boltzmann distribution. 
This is then the Ising model on an arbitrary graph.

We can then write the Boltzmann distribution of spins $P(\{\sigma\})$, as given
by the set of pairwise constraints $J_{ij}$, and external fields $h_i$,

(2) $P(\{\sigma\}) = \frac{1}{Z} \exp\left[\beta \left( \sum_{i<j} J_{ij}\,\sigma_i \sigma_j + \sum_i h_i \sigma_i \right)\right]$.

The Jij are simply the edges of the different networks we consider. The 
system is in the maximum entropy state with only one observable - average 
total energy, or number of satisfied pairs - fixed [55]. Usually, the hi are 
taken to be zero; when they are non-zero it amounts to external constraints 
acting on single nodes - such as one might expect in a network partially 
devoted to sensing external conditions. 

The most important parameter for this study is the overall factor of $\beta$.
Large values of $\beta$ correspond to the low-noise regime; conversely, as $\beta$ goes
to zero, the coupling between nodes is swamped by random fluctuations.
We refer to $\beta$ as the inverse noise, and focus on how changing $\beta$ leads to
changes in how the network correlates and processes information in both 
equilibrium and non-equilibrium situations. 

Determining the correlational and information-theoretic properties of the 
networks involves finding the joint probabilities of the states of the network,
$P(\{\sigma\})$. In general, we are interested in quantities such as $P(\sigma_i, \sigma_j)$, the joint
probability of two nodes i and j being in the same, or opposite, switching 
states. 
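For very small graphs these joint probabilities can be obtained by brute-force enumeration, which is useful as a reference point. The sketch below assumes the usual Ising form of the Boltzmann weight, $\exp[\beta(\sum_{\rm edges} \sigma_i \sigma_j + \sum_i h_i \sigma_i)]$, with unit couplings on every edge; the function names `boltzmann`, `pair_joint`, and `correlation` are illustrative and are not the direct configurational method used in the paper.

```python
import itertools
import math


def boltzmann(n, edges, beta, h=None):
    """Exact P({sigma}) for n spins in {-1, +1}, unit couplings on `edges`."""
    h = h or [0.0] * n
    weights = {}
    for s in itertools.product((-1, 1), repeat=n):
        exponent = sum(s[i] * s[j] for i, j in edges)
        exponent += sum(hi * si for hi, si in zip(h, s))
        weights[s] = math.exp(beta * exponent)
    z = sum(weights.values())  # the partition function Z
    return {s: w / z for s, w in weights.items()}


def pair_joint(p, i, j):
    """Marginal joint probability P(sigma_i, sigma_j)."""
    joint = {}
    for s, prob in p.items():
        key = (s[i], s[j])
        joint[key] = joint.get(key, 0.0) + prob
    return joint


def correlation(p, i, j):
    """Two-point correlation <sigma_i sigma_j>."""
    return sum(prob * s[i] * s[j] for s, prob in p.items())


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    p = boltzmann(3, triangle, beta=0.5)
    print(pair_joint(p, 0, 1))
    print(correlation(p, 0, 1))
```

The probability that two nodes share the same state is $(1 + \langle \sigma_i \sigma_j \rangle)/2$, so the correlation and the joint probability carry the same information in zero field. The cost grows as $2^N$, which is why enumeration is only practical for motif-sized graphs.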

There are many different approaches to finding, or approximating, $P(\{\sigma\})$;
they are valid in different regimes. For the construction rules we consider, for
motif structures with maximum degree of two (i.e., chains), exact solutions
of the Ising model are possible via a renormalization group transformation,
and for structures with maximum degree of three, an exact solution for the
partition function in zero field is generally possible [56]. For arbitrary
motifs, however, the partition function for the $i$th iteration can no longer be
written as the partition function for the $(i-1)$th with a suitable change of
coupling, $J \to J'$.

In this paper, we adapt the "direct configurational" method (DCM; see, 
e.g., Ref. [57]), which allows exact computation in small, finite networks.
The self-similar properties of our networks allow these computations to be 
extended to graphs with many hundreds of nodes. 

The computation of $P(\{\sigma\})$ can be done via the normalizing term, or
"partition function," $Z$, in the denominator of Eq. (2). Derivatives of $Z$ with
respect to $h$ then give moments of $P(\{\sigma\})$. These computations not only
provide an exact solution, but decompose graphically into sums of "paths of
influence" closely related to the Feynman diagrams of condensed matter and
particle physics. Appendix B describes these calculations, which provide a
rigorous basis for the qualitative discussion of how multiple paths lead to
critical phenomena (see, e.g., Ref. [58]).



3. Phase Transitions and the Mesoscopic Regime

Despite the simplicity of Eq. (2), the model has a rich set of behaviors,
including (depending on the graph structure) a critical point, $\beta_c$. The
characterization of critical phenomena has been a central theme of the study of
complex networks [55] . With exact expressions for the correlations in hand 
(see Appendix B), we can study the nature of the order-disorder transition 
on the different hierarchies presented here. 

Despite the length of the expansions - ratios of two power series in $\tanh \beta$
to $O(N)$ - the general behavior of the correlation functions for different
networks is similar, with a monotonic rise from the disordered to the
ordered state. The leading-order behavior in the high-noise limit as $\beta \to 0$
is $(\tanh \beta)^{r_{\min}}$, where $r_{\min}$ is the shortest distance between the two vertices
under consideration. The failure of this approximation is due to the
increasing number of paths of influence available, which can allow longer paths to
dominate if they increase in number quickly enough to offset the exponential 
suppression in signal. 
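The smallest case already shows how extra paths correct the leading-order term. For a single triangle with unit couplings and zero field (an illustrative choice, not a result quoted from Appendix B), the standard high-temperature expansion gives, for any pair of vertices,

```latex
\begin{equation}
\langle \sigma_i \sigma_j \rangle
  = \frac{t + t^{2}}{1 + t^{3}}, \qquad t = \tanh\beta .
\end{equation}
```

The term $t$ is the direct edge ($r_{\min} = 1$), the term $t^{2}$ is the two-step path through the third vertex, and the $t^{3}$ in the denominator comes from the closed loop. At $t = 0.5$ the exact value is $0.75 / 1.125 \approx 0.67$, compared with the leading-order estimate of $0.5$, so even one additional path raises the correlation by a third.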

For branching networks, the tree-like structure suggests that a phase tran- 
sition in the bulk is prevented at non-zero noise by the nucleation of bound- 
ary spins as happens in the Cayley tree (as distinguished from the Bethe 
lattice) [60]. Since nested graphs can also be detached by a constant
number of cuts at any iteration - even when $N \to \infty$ - the critical point in the
thermodynamic limit is also expected to be zero [61], similar to the kind of
ferromagnetic frustration found in random graphs [62].

For these reasons, it is thus useful to define a critical point for a finite 
system without reference to a thermodynamic limit but through the behavior 
of various correlations that, though never mathematically singular, do show 
the existence of a transition between two distinct behaviors. 

For the particular example of the Cayley tree, Ref. [63] introduced the 
notion of a cross-over noise, $\beta_g$. Decreasing noise, which pushed $\beta$ above
$\beta_g$, was associated with the emergence of non-Gaussianity, a slowdown of
dynamics, and glassy behaviors such as aging; the critical noise parameter $\beta_g$
goes to the infinite-size limit ($1/\beta_c$ goes to zero) very slowly (as $\log(\log N)$),
so that the thermodynamic limit is not representative even for very large
systems. The slow approach to thermodynamic limits is often found in finite
ramification structures such as the Cayley tree.



We investigate these systems below using analytic tools (Sec. 3.1) and
numerical simulation (Sec. 3.2), on networks of size $N \sim 300$. Because of
the extremely slow scaling discussed above, there is a significant range of
system sizes where equilibrium properties do not vary appreciably.
This region, which we refer to as the mesoscopic regime, is qualitatively 
different from the infinite size limit. The networks we study here are in this 
regime - but so are much larger networks, and system sizes nine orders of 
magnitude larger, for example, are expected to have properties that differ 
by only a factor of two. 
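The factor of two can be checked with a line of arithmetic. The sketch below assumes the relevant quantity scales as a pure $\log(\log N)$ with natural logarithms and no additive offset or prefactor; the text gives only the functional form, so the exact number depends on that assumption.

```python
import math


def loglog(n):
    """Double natural logarithm, the assumed scaling form."""
    return math.log(math.log(n))


n_small = 300               # network size studied here
n_large = n_small * 10**9   # nine orders of magnitude larger
ratio = loglog(n_large) / loglog(n_small)
print(round(ratio, 2))
```

Under these assumptions the ratio comes out just below two.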

Figure 5. Stationary Aspects. Critical behavior, as measured
by the heat capacity (top) and correlation length (bottom)
for a large network composed of branching (solid line)
and nested (dotted line) iterations on the triangle motif. The 
networks are all four-stage iterations, with 243 nodes and 363 
bonds. In the heat capacity measure, nested constructions 
show a greater concentration of accessible states at the tran- 
sition point. The correlation length for branching networks 
is initially larger than for nested, but around the cross-over 
noise the nested structures show a rapid rise. Both these ef- 
fects are driven by the existence of multiple paths between 
points in nested networks. Vertical gray lines show where 
the correlation length exceeds the average network diame- 
ter, leading to an undamped pathway. In both cases, this 
happens near the peak of the heat capacity. 

3.1. Stationary Aspects. We measure two quantities related to the sta- 
tionary, equilibrium properties of the two networks, focusing on their critical 
phenomena. First we consider the heat capacity, $C$ (see, e.g., Ref. [65] for
an example of its use in biological systems). At constant external field, we
have

(3) $C = \frac{T}{V}\frac{dS}{dT} = \frac{1}{V}\frac{dS}{d\ln T} = \frac{1}{V}\frac{d\ln M}{d\ln T}$,

where S is the entropy, V the volume (here, the number of nodes), and M 
the weighted number of states accessible to the system. The heat capacity 
measures the (logarithmic) number of states accessible per (log) unit noise. 
It has a maximum, more or less sharply peaked, as a function of $\beta$. As one
heats the system through this point, the number of accessible states increases 
dramatically, and the intensity and variety of the cooperative phenomena in 
the transition can be quantified by the height of the peak. Nested networks 
are characterized by a greater concentration of states, as can be seen in 
the top panel of Fig. 5; they can be said to have sharper transitions to the
ordered state. 
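On a motif-sized graph the heat capacity can be computed directly from energy fluctuations, which makes the definition concrete. The sketch below assumes unit couplings, zero field, $k_B = 1$, and $T = 1/\beta$, and uses the standard identity $C = \beta^2 \, \mathrm{Var}(E) / V$; the function name `thermo` is illustrative. It also returns the entropy, so that the identity can be compared against the derivative of $S$ with respect to $\ln T$.

```python
import itertools
import math


def thermo(n, edges, beta):
    """Return (entropy, heat capacity per node) by exact enumeration.

    Energy is E = -sum over edges of sigma_i * sigma_j (unit couplings,
    zero field); the heat capacity is beta^2 * Var(E) / n.
    """
    states = itertools.product((-1, 1), repeat=n)
    energies = [-sum(s[i] * s[j] for i, j in edges) for s in states]
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    mean = sum(w * e for w, e in zip(weights, energies)) / z
    var = sum(w * (e - mean) ** 2 for w, e in zip(weights, energies)) / z
    entropy = math.log(z) + beta * mean
    heat_capacity = beta ** 2 * var / n
    return entropy, heat_capacity


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    for beta in (0.1, 0.5, 1.0, 2.0):
        print(beta, thermo(3, triangle, beta)[1])
```

Scanning $\beta$ in this way traces out the peak whose height is used as the measure of transition sharpness. Exact enumeration is limited to small graphs, since the number of states grows as $2^N$.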

The approach to a phase transition is often defined in terms of a transition 
between an exponential, and power-law, decline in the correlation function 
as a function of distance. If we measure the correlation between pairs of
nodes separated by a distance $\Delta r$, we can define a correlation length, $D$,

(4) $\langle \sigma(r)\,\sigma(r + \Delta r) \rangle \propto e^{-\Delta r / D}$,

where on these inhomogeneous networks we take $\Delta r$ to be the geodesic
distance between points. The bottom panel of Fig. 5 shows the transition
that occurs as one passes into the low-noise regime: nested networks, with
multiple paths between distant points, allow distant parts of the network to
correlate at $\beta \sim 1$.

As noted in Sec. 1, nested networks have larger diameter. In the case of
the $g = 4$ iteration, the nested networks have a maximum diameter of 32,
compared to 9 for the more tightly structured branching networks. This
means that at high noise (small $\beta$), nested networks allow for greater
modularity - distant parts of the network are less correlated. The transition that
occurs at $\beta \sim 1$, where $D$ for nested networks becomes much larger, reverses
this property; nested hierarchies at low noise have stronger long-range
correlations. We return to the question of modularity in Sec. 3.3, where we
address it through simulation.

The correlation length $D$ exceeds the average diameter at $\beta$ roughly 0.8
(branching) and 0.9 (nested). By analogy with infinite-limit, and
homogeneous, systems, one can consider this noise level as where the effective mass of
long-range fluctuations becomes zero. In contrast with the standard infinite 
limit phase transition, this critical point occurs near, but not precisely at, 
the maximum of the network heat capacity. 

3.2. Dynamical Aspects. Heat capacity and correlation length are both 
static measures of modularity and signaling. We also expect dynamical 
signatures of the cross-over in finite networks. In this section, we show 
that though branching networks are much smaller in diameter (Fig. 3) than
nested networks, they have much longer timescales (Fig. 6). Nodes are
actually less coupled compared to the higher-diameter nested networks, where
multiple paths between nodes exist. 

In general there are many dynamics compatible with the stationary
distributions of Eq. (2). We take the standard Glauber dynamics [66, 67], with
each update step being associated with a different randomly chosen node.
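A single-spin Glauber update is short enough to state in full. The sketch below uses the standard heat-bath flip probability $1 / (1 + e^{\beta \Delta E})$ and assumes unit couplings and zero field; the function names and the adjacency-dictionary representation are illustrative choices, not the authors' simulation code.

```python
import math
import random


def flip_probability(spins, neighbors, i, beta):
    """Glauber probability of flipping spin i (unit couplings, zero field)."""
    local_field = sum(spins[j] for j in neighbors[i])
    delta_e = 2.0 * spins[i] * local_field  # energy change if spin i flips
    return 1.0 / (1.0 + math.exp(beta * delta_e))


def glauber_step(spins, neighbors, beta, rng):
    """One update step: pick a random node and flip it with Glauber probability."""
    i = rng.randrange(len(spins))
    if rng.random() < flip_probability(spins, neighbors, i, beta):
        spins[i] = -spins[i]
    return i


if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    rng = random.Random(1)
    spins = [rng.choice((-1, 1)) for _ in range(3)]
    for _ in range(10):
        glauber_step(spins, triangle, beta=1.0, rng=rng)
    print(spins)
```

The ratio of the forward and reverse flip probabilities equals $e^{-\beta \Delta E}$, which is the detailed-balance condition that makes the Boltzmann distribution stationary under this rule.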

Given a sufficiently long time series for any pair of nodes, we can then
measure the timescales of their dynamics. We focus here on the decay of
the overlap,

(5) $C(t, t_w) = \frac{1}{N} \sum_{i=1}^{N} \sigma_i(t)\,\sigma_i(t + t_w)$,

where a single step, $\Delta t$ equal to one, is an update of a randomly chosen spin.
The function $C(t, t_w)$ decays from unity (at $t_w$ equal to zero) down to (at
noises below the glassy phase) a noise floor given by the Poisson statistics
of uncorrelated spins. It can be used to measure a number of different
properties, including that of aging below the spin glass transition [63]. Here
we measure the relaxation time, the time it takes $C(t, t_w)$ to cross a
particular threshold. In Fig. 6 the threshold is taken to be 0.5, so that the
relaxation time is the average time for a
node to flip with 25% probability. Because of the long-tailed distribution of
relaxation times, we follow Ref. [70] in estimating it by the median, instead
of the mean.
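The link between the 0.5 threshold and a 25% flip probability follows from the overlap being $1 - 2f$, where $f$ is the fraction of spins that differ between the two configurations. A minimal sketch, assuming the overlap is the normalized sum of products of spins at two times and using illustrative helper names:

```python
from statistics import median


def overlap(a, b):
    """Normalized overlap of two +/-1 configurations: (1/N) * sum_i a_i * b_i."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def crossing_time(overlaps, threshold=0.5):
    """First lag at which the overlap series is at or below the threshold."""
    for lag, c in enumerate(overlaps):
        if c <= threshold:
            return lag
    return None  # the threshold was never crossed within the series


if __name__ == "__main__":
    start = [1] * 8
    later = [-1, -1, 1, 1, 1, 1, 1, 1]   # two of eight spins (25%) flipped
    print(overlap(start, later))          # overlap is 1 - 2 * 0.25
    print(crossing_time([1.0, 0.75, 0.5, 0.25]))
    print(median([3, 5, 1000]))           # robust to one very slow run
```

The median is insensitive to a small number of extremely long-lived runs, which is the reason for preferring it when relaxation times are long-tailed.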

Relaxation times for spin-glass systems are themselves time-dependent - 
the longer one lets the system run, the longer the correlation time becomes. 
This is referred to in the physics literature as 'aging' [71] - correlational
properties depend on the age (time since initialization by random initial
conditions) of the system. We also see evidence for time-dependent
correlation functions past the critical point, similar to that found by Ref. [63], but
focus here on the contrasting behavior of the relaxation time at constant
age. We are here in the strongly out-of-equilibrium regime (long timescales
on a newly-initialized network).

The top panel of Fig. 6 shows how the relaxation time scales with $\beta$. The strongest
differences between the two networks emerge in the low-noise (high-$\beta$) regime. In
particular, branching networks, with their hub-and-spoke topologies, show 
timescales more than two orders of magnitude longer than their nested coun- 
terparts. 

The differences are due to bottlenecks to communication that exist be- 
tween distant parts of the network in the branching case. Since all paths 
between distant nodes must pass through a single point, the speed of com- 
munication is limited by the timescale for that single point to change state. 



We expect other local update rules, such as Metropolis [68], to have similar dynamical
properties, with differences appearing only on introduction of non-local rules such as those
of Ref. [69].

Figure 6. Dynamical Aspects. Top: The relaxation
time as a function of inverse noise $\beta$. Timescales are
shown for branching (solid line) and nested (dashed line)
networks. Network parameters here are the same as in Fig. 5.
As the noise drops ($\beta$ increases), the relaxation times diverge
for both networks. At noises below the glassy transition, it
is the branching networks that show a stronger slowdown,
caused by bottleneck-frustration similar in nature to that of
the Cayley tree. Ranges between the thinner lines enclose
50% of samples. Bottom: distribution of the relaxation time at $\beta = 1.5$,
showing the dispersion in relaxation times within a particu- 
lar network structure. 

This is similar to the frustration effects seen in the Cayley tree by Ref. [63].
By contrast, in the nested case there are multiple paths between distant 
points, and this means that long-timescale frustration effects are limited - 
made rarer, though not completely eliminated.

That small-world networks, if they rely on hub-and-spoke topologies,
are actually slower, is connected to the emergence of long-lived metastable
states. The analogs of domain walls for inhomogeneous networks -
separated parts of the system that fall into opposite states of local consensus -
emerge at low noise. These walls propagate through the system until they 
meet bottlenecks - places where disparate parts of the network connect via a 
single node - and are effectively pinned for long periods. Multiple paths, by 
contrast, increase the number of points of contact between different neigh- 
borhoods. 

The dispersion in behavior (lower panel of Fig. 6) for both networks is
large. This shows the effect of non-equilibrium dynamics and an (effective, 
since finite-time) breaking of ergodicity. In some cases, the initial conditions, 
after burn-in, may lead to a particularly ordered configuration from which 
the system departs with only vanishing probability. In other cases, meta- 
stable states are not as long-lived, and relaxation can happen quickly. 

That one can achieve dispersion in behavioral timescales of nearly five 
orders of magnitude from a system with only $\sim 10^2$ nodes is remarkable.
The dispersion, which itself sees an exponential rise at $\beta \sim 1$, is another
indication of the presence of a finite-size critical phase, present only in the 
mesoscopic regime. 

3.3. Fluctuation Localization. Though branching networks are smaller 
- nodes are, on average, closer to each other - we have shown by simulation 
that the timescales of dynamical change are much longer (Fig. 6).
Meanwhile, we can determine how many new configurations become accessible
as the noise declines from our analytic determination of the heat capacity
(Fig. 5).

In this section, we examine features relevant to information-processing, 
which is a property of both the stationary properties of the network (how 
many configurations are accessible) and the dynamical ones (how quickly 
one configuration turns into another.) 

In particular, we ask about the entropy of the system over finite time, and 
how and where that information is stored: locally (in single nodes), on the 
motif scale, or non-locally, across widely separated motifs. Such questions 
are essential to biological function: distinct substructures must not only pro- 
cess information by means of local motif patterns, but also communicate the 
results of that processing to more distant nodes. Anomalous concentrations 
of a metabolic product, say, may be detected by influences on one part of 
the system, but may need to trigger a transcriptional cascade in a different 
module. 

For systems where bits are largely independent, the multi-information 
is close to zero, indicating that very little information is exchanged be- 
tween subgroups. When nodes come to process information in complex 
ways, however, the multi-information becomes larger, indicating that ap- 
parent randomness at the local scale becomes pattern exchange on larger 
scales. Formally, the multi-information at any particular scale is the
decrease in entropy seen when the distributions taken by the smaller
scales are combined into a joint probability distribution.

We measure the multi-information (see, e.g., [72]), a generalization of
mutual information used to describe cooperative information processing
[73, 74]. For the case of three subsystems, whose internal states are each
represented by a vector X_i, we have for the multi-information

(6)    $I_{nl} = \sum_{i=1}^{3} H(P[X_i]) - H(P[X_1, X_2, X_3])$

We consider the multi-information between the motif and global scales,
with the three most widely separated, but equidistant, triangle motifs
chosen as the X_i sets. In words, the first term of Eq. [6] is the total
entropy of the subsystems considered in isolation from each other; if there
were no long-range synchronization, this would be equal to the second term,
and the multi-information would be zero. Conversely, since the maximum
amount of information in the three subsystems is nine bits, the maximum
amount of multi-information is six bits (all three distant motifs
perfectly correlated).
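The two limiting cases just described (zero bits for independent motifs, six bits for three perfectly correlated triangle motifs) can be checked with a minimal sketch. It uses a naive plug-in entropy estimate rather than the NSB estimator used for the results in this paper, and the function names `entropy_bits` and `multi_information` are illustrative choices, not taken from the original analysis.

```python
import math
from collections import Counter
from itertools import product


def entropy_bits(samples):
    """Plug-in (maximum-likelihood) Shannon entropy, in bits, of hashable states."""
    counts = Counter(samples)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def multi_information(samples):
    """Sum of the subsystem entropies minus the joint entropy.

    `samples` is a list of tuples (x1, x2, x3), one per observation;
    each x_i is the state of one subsystem, e.g. a 3-bit motif state.
    """
    n_subsystems = len(samples[0])
    isolated = sum(
        entropy_bits([s[i] for s in samples]) for i in range(n_subsystems)
    )
    return isolated - entropy_bits(samples)


motif_states = list(product((0, 1), repeat=3))   # the 8 states of one triangle
locked = [(s, s, s) for s in motif_states]       # three motifs always agree
free = list(product(motif_states, repeat=3))     # three motifs independent
print(multi_information(locked), multi_information(free))
```

On real simulation output the plug-in estimate is biased when samples are scarce, which is one reason to prefer a dedicated estimator in that setting.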

Fig. [7] shows these results, computed directly from simulation. We
estimate the multi-information using the NSB estimator [75, 76] - we find
it leads to good estimates of simulated datasets with the dramatically
lower entropies one expects past the mesoscopic critical points.

Both the branching and nested structures show a distinct window at
which long-range synchronization is strongest and roughly half a bit can
be communicated between distant parts of the system. At first, as noise
decreases, distant nodes become more correlated (as in Fig. [sj), and the
multi-information rises; however, at low noise (large β), fluctuations on
all scales are frozen in. In both cases, this window appears around the
same noise level as the peak of the heat capacity; this provides additional
support to the description of a mesoscopic phase transition, since more
conventional thermodynamic systems are known to have maximum
multi-information at the critical point [77].

4. Discussion 

In contrast with the regular lattices of field theory, complex networks are
characterized by both small-scale pattern and large-scale structural
diversity. On small scales, repeating network motifs [78] indicate strong
local inhomogeneity. On large scales, networks may be characterized by
modularity or by large-scale motifs visible under coarse-graining or
aggregation of vertices [79, 80]. The study of such transformations on
complex networks has uncovered evidence for self-similarity [H], and
small-scale and large-scale network structures, for example, are found to
be correlated in cellular networks [81].

When the NSB estimator is used to estimate the multi-information by
subtraction, we find that it is not unbiased; this effect is overwhelmed,
however, for multi-information measurements larger than 10^{-2} bits, by
the intrinsic dispersion of simulation runs.

Figure 7. Fluctuation Delocalization. Top: The multi-information between
the local (motif-scale) and global scales for branching (solid) and nested
(dashed) structures. Also shown are the upper and lower ranges for 50% of
the networks studied. In both cases, greater non-local correlations (high
multi-information) are seen as the noise is reduced (β increases) - until
a critical point at which multi-information declines to zero, indicating
that information processing has become local again. The heavy line at
~ 0.01 indicates the 1σ errors associated with the NSB estimator. Bottom:
distributions near the peak of the multi-information (branching at
β = 0.9; nested at β = 1.1) showing the dispersion in measurements.
                        Branching         Nested
diameter                short distance    long distance
phase transitions       soft              hard
dynamical timescales    slow              rapid
low-noise processing    local             global

Table 1. Summary of Results. The behaviors of contrasting self-similar
networks in the mesoscopic regime.

In this contribution, we compared two alternative topologies - networks 
of branching motifs (with one pattern replicated many times) and networks 
of nested motifs (where patterns play the role of templates.) Branching net- 
works have a familiar tree-like structure and possess the small-world prop- 
erty; their benefits include efficient signal propagation at high noise. Nested 
networks retain self-similarity but without small-world scaling, and confer 
benefits such as redundant paths between distant nodes at the cost of longer 
path lengths. 

A central theme has been the difference at the onset of a mesoscopic ver- 
sion of a phase transition. Phase transitions in general occur in networks 
when the exponential fading of a correlation along a particular path is bal- 
anced by the exponential increase in the number of paths between the two 
points [58]. In complex networks, this implies that structural inhomogene- 
ity on a range of different scales will be relevant for the critical behavior 
analogous to that found in more regular systems. 
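As a rough illustration of this balance (a heuristic sketch, not the calculation of Ref. [58]), suppose that a correlation carried along a single path of length $\ell$ fades as $v^{\ell}$, with $v = \tanh \beta J$, and that the number of relevant paths of length $\ell$ grows as $\mu^{\ell}$. The growth rate $\mu$ is an assumed, effective quantity introduced here only for illustration.

```latex
\[
  C(\ell) \;\sim\;
  \underbrace{\mu^{\ell}}_{\text{number of paths}}
  \times
  \underbrace{v^{\ell}}_{\text{fading along each path}}
  \;=\; (\mu v)^{\ell},
  \qquad v = \tanh \beta J ,
\]
\[
  \text{so the two exponentials balance when}\quad
  \mu \tanh \beta_c J = 1 .
\]
```

The same bookkeeping, applied to the number of vertices at distance $\ell$ on a tree in which every vertex has $z$ neighbours (so that $\mu = z - 1$), gives the known Bethe-lattice condition $(z - 1) \tanh \beta_c J = 1$. In an inhomogeneous network $\mu$ depends on scale, which is one way to read the statement that structure on a range of scales matters for the critical behavior.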

Our investigation has uncovered a number of counter-intuitive properties 
of small-world systems. Smaller diameter networks adjust more slowly, have
shorter correlation lengths, and cannot achieve the levels of non-local
integration seen in nested systems. Our analytic exposition of the problem
shows explicitly how the onset of correlations is driven by the existence of
multiple paths between points; our simulations show how the existence of 
such paths allows for the more rapid dissipation of inhomogeneity. Multiple 
paths are thus central for both information processing and the timescales of 
coordination. 

In some cases, the characteristic features of the small-world topology
listed in Table [1] are desirable. Small-world topologies can lead to
greater modularity, and longer timescales, than more "open" topologies with
longer path lengths. At low noise, their fluctuations are more localized,
meaning that fluctuations in distant structures are increasingly
independent, and disjoint memories do not merge and fade as fast. Depending
on the nature of computation, these may be desirable properties - as they
are, for example, in the case of the liquid state model [82].
The existence of such paths also bears on the question of network
robustness - particularly under targeted attack [83]. When all correlational
information between two nodes must travel along a single path, the failure 
of any intermediate node is catastrophic. Conversely, robustness to node 
deletion will, in general, increase as the number of distinct paths between 
points increases, even if the number of edges remains constant. 
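A small sketch makes the contrast concrete for the triangle motif after one round of assembly. Both networks below have nine vertices and twelve edges, yet only the branching one can be disconnected by deleting a single vertex. The helper names (`two_level`, `cut_vertices`) and the address convention (vertex `(a, i)` is node `a` inside the copy placed at `i`) are assumptions made for this illustration.

```python
TRIANGLE = [(0, 1), (1, 2), (0, 2)]


def two_level(motif_edges, n_motif, rule):
    """Edges of the motif assembled once with the 'branching' or 'nested' rule."""
    edges = [((a, i), (b, i)) for i in range(n_motif) for a, b in motif_edges]
    for i, j in motif_edges:
        # branching: copy i always connects through its own vertex i;
        # nested: copy i connects to copy j through its vertex j.
        a, b = (i, j) if rule == "branching" else (j, i)
        edges.append(((a, i), (b, j)))
    return edges


def is_connected(vertices, edges):
    """Depth-first search restricted to the given vertex set."""
    vertices = set(vertices)
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        if a in vertices and b in vertices:
            adjacency[a].add(b)
            adjacency[b].add(a)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == vertices


def cut_vertices(edges):
    """Vertices whose removal disconnects the graph (brute force)."""
    vertices = {v for e in edges for v in e}
    return sorted(v for v in vertices if not is_connected(vertices - {v}, edges))


for rule in ("branching", "nested"):
    edges = two_level(TRIANGLE, 3, rule)
    print(rule, len(edges), cut_vertices(edges))
```

The brute-force check is quadratic in network size and is meant only for motif-scale examples like this one.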

We suggest that our work is particularly relevant to the study of
information processing in the brain [83, 85]. On the one hand, the maximum
entropy model of Sec. [2] has formed the basis of a powerful set of models
for the description of observed neural correlations [86], and the
information-theoretic quantities we have investigated are directly related
to the Tononi Φ measure [87] and the Cat measure of Ref. [88]. These latter
measures consider various bi-partitions; the fractal structure of our
networks naturally suggests extensions of these measures to the tower of
higher-order correlations as described in Ref. [72].

On the other hand, the multi-scale structure of the brain - from scales
of 50 μm to centimeters - is well-established [89, and refs. therein]. The
topological and dynamical properties of certain random and deterministic
self-similar wirings, relevant to neuroscience, have been under recent
investigation [89, 90]. Our work has direct bearing on explicit models of
cortical network architecture [91], and in particular suggests that
small-world path lengths may not be the only way in which a network might
optimize information processing.

Self-similar network properties have proven relevant to the study of a
vast range of other natural systems, from gene-regulatory [10, 11] and
metabolic networks [12], all the way up to food webs [17] and human social
networks [13-16]. In the case of social networks, for example, branching
networks with complete-graph motifs are small-world examples of the robust
social quilts studied by Ref. [92], while "span of control" theories [93]
address the consequences of hierarchy for information processing and
dynamics [SI]. Hierarchical structure may also be associated with the
emergence of long timescales associated with strategic information
processing in animal systems [93].

In parallel, the maximum-entropy models we consider here have proven 
useful not only in studies of neural functioning, but also in studies of the 
immune system [96], and animal behavior [97, 98]. In many cases, such
systems are found at criticality [99], making it important to understand the 
mesoscopic regime. 

The analysis of this paper suggests that statistics related to the existence 
of multiple paths in a network may be an important way to determine how 
relevant structural features have been organized to achieve the contrasting 
properties found in Table [1]. It may not be necessary to compute all the
relevant Feynman diagrams in a graph to answer central questions about
the nature of the critical point and ordered phase. When calibrated against
the exactly-solved models of this paper, statistics related to the scaling of
the number of paths between vertices as a function of distance may be
sufficient to study both the nature of the critical point and the existence of
non-equilibrium effects. We leave this question for future investigation.

Most theoretical studies have focused on comparing the functional impli- 
cations of self-similar and non-self-similar networks. We have found none 
that consider the functional implications of alternative self-similar networks. 
If it proves to be true that constraints of development account for, and
impose, widespread network self-similarity, then variations on a fractal
theme will become the principal means by which development might tinker
with functionally important properties.
6. Appendix A: Formal Definitions of Branching and Nested Networks

Beginning with a motif, M, with N(M) vertices, we build up, by iteration,
a larger structure, S(q, M), where q is the number of iterations and
S(q = 0, M) is M. The motif directs the assembly of increasingly larger structures,
in a recursive fashion, providing the graph with both small and large scale 
inhomogeneity. 

At the qth iteration, replace the vertices in S(q - 1, M) by separate copies
of M, and rewire the system while maintaining the local motif structure. 
One might take the vertices of a triangle, for example, and replace each of 
them by a copy of the same three-node structure. The different ways to 
accomplish this model how a network may develop the internal structure 
of its subsystems; going the other direction, a particular choice defines a 
coarse-graining operation that might form an element of a renormalization 
group. 

More formally, for each vertex i in M, replace the vertex by a copy of
M, M_i. For each vertex j in M, connect the free edges - those remaining
from the previous iteration that were attached to vertex i - to the internal
vertices of M_j, by some mapping f(i, j) (generally not symmetric).

At the qth iteration, take M and replace each vertex i in M with a copy
of S(q - 1, M). The rewiring now takes an edge from the jth subunit to the
ith subunit, and attaches it to the f(i, j) vertex in S(q - 1, M). The f(i, j)
vertex for S(2, M) is defined as the f(i, j)th vertex in the f(i, j)th subunit,
and so forth for higher values of q.

Graph S(q, M) has $N(M)^{q+1}$ vertices and $n(M) \sum_{i=0}^{q} N(M)^{i}$,
or $n (N^{q+1} - 1)/(N - 1)$, bonds. The average degree of S is always close
to that of the local graph M, so that sparse networks remain sparse; however,
the higher moments of the degree distribution may grow dramatically
depending on the choice of f.

Going from S(q, M) to S(q - 1, M) is a form of renormalization [l3]. Once
M is chosen, the remaining choice is that of the assembly rule, f(i, j); we
consider the two simplest cases, f(i, j) equal to i (branching assembly) and
to j (nested assembly). These operations are easier to see graphically; for
the example of a triangle motif being replicated at multiple scales by the
two methods, see Fig. [1]. For a more explicit example of how the f(i, j) rule
works, see Fig. [8].
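The construction can also be sketched in code. The version below is one possible reading of the assembly rule, with vertices labelled by address tuples whose last entry names the top-level subunit; the function name `build_network` and this labelling are assumptions made for illustration. It lets the vertex and bond counts quoted above be checked directly.

```python
from collections import Counter

TRIANGLE = [(0, 1), (1, 2), (0, 2)]


def build_network(motif_edges, n_motif, q, rule):
    """Return (vertices, edges) of S(q, M) for rule 'branching' or 'nested'."""
    if rule not in ("branching", "nested"):
        raise ValueError("rule must be 'branching' or 'nested'")
    vertices = [(i,) for i in range(n_motif)]
    edges = [((i,), (j,)) for i, j in motif_edges]
    for level in range(1, q + 1):

        def port(i, j):
            # Fine-grained vertex of subunit i that receives the edge towards
            # subunit j: every lower-level address entry equals f(i, j).
            k = i if rule == "branching" else j
            return (k,) * level + (i,)

        new_vertices = [v + (i,) for i in range(n_motif) for v in vertices]
        new_edges = [(a + (i,), b + (i,)) for i in range(n_motif) for a, b in edges]
        new_edges += [(port(i, j), port(j, i)) for i, j in motif_edges]
        vertices, edges = new_vertices, new_edges
    return vertices, edges


def max_degree(edges):
    """Largest vertex degree in an edge list."""
    return max(Counter(v for e in edges for v in e).values())


for rule in ("branching", "nested"):
    vertices, edges = build_network(TRIANGLE, 3, 2, rule)
    print(rule, len(vertices), len(edges), max_degree(edges))
```

With the triangle motif, the branching rule concentrates the new connections on one vertex per subunit, so the maximum degree grows with q, while under the nested rule it stays at three. This is one way the higher moments of the degree distribution come to depend on the choice of f.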



7. Appendix B: Ising Solutions in the Direct Configurational Method

In Sec. 3.1 we examined stationary properties of the system. Because of
the divergence of timescales discussed in Sec. 3.2, it is difficult to
determine reliable measurements of these properties from simulation. Here
we discuss the "Direct Configurational Method" (DCM), which allows one to
write down expressions for these properties analytically. The expressions
are long, but tractable by analytic methods that use computer algebra. They
enable us to separate finite-size-finite-time effects (accessible by
simulation) from finite-size-infinite-time effects associated with
equilibrium.

Figure 8. An example of how the f(i, j) function specifies which
fine-grained node to connect to, for the first iteration of the nested
case, f(i, j) = j.

A number of different expansions for the correlations can be written in
the high-noise (i.e., β ≪ β_c) limit. The most common, known as the
linked-cluster expansion [100], has formed the center of studies of the
Ising model on regular lattices [101-103].

Because of the attention paid to lattices with great amounts of symmetry 
and of infinite extent, less often used are the exact solutions, expressible as a 
power series with a finite number of terms (tanh βJ)^n, available for lattices
of finite size. This "direct configurational method" (Ch. 2, Ref. [57]) allows
one to write an expression for the partition function of a graph directly, by 
enumerating all of the subgraphs (including disconnected subgraphs) of the 
original lattice with all vertices even. Similar expressions, with some ver- 
tices "rooted" in various ways, allow one to determine correlation functions 
through partial derivatives of Z. 

For a system of any appreciable size, enumerating the disconnected graphs 
is a nearly impossible computational task. Finding the free energy, equal 
to ln Z, turns such a sum of disconnected graphs into a far shorter sum
involving only connected graphs, with multiple bonds between vertices
allowed, weighted in a new fashion (Ch. 20, Ref. [104]). These are the usual
Feynman diagrams, and allow one to handle an arbitrarily large lattice to
finite order in β. When the lattice has translation symmetries, bond- and
vertex-renormalization [100, 105-107] becomes possible, making computations
to very high order possible (currently around 20th order [108]).

In the case of a biological network, however, many of these techniques 
become impractical; the standard renormalization procedures are frustrated 
by the strong inhomogeneity in the network, and the unrenormalized graphs 
are far more numerous and still require computation of the symmetry factors. 
When a network is characterized by repeating motifs within a larger lattice, 
however, the enumeration of subgraphs becomes plausible. 

In the DCM, to compute the partition function, Z, on a graph G, we
take all subgraphs g of G with vertices even; this set is written E(G) and
includes disconnected subgraphs. We can then write

(7)    $Z = 2^{N(G)} (\cosh \beta J)^{n(G)} \sum_{g \in E(G)} v^{n(g)}$

where n(g) is the number of edges in graph (or subgraph) g, N(g) the number
of vertices, and v is tanh βJ. We take E(G) to include the "null graph" with
no edges. Finding the derivatives of Z with respect to a set of external fields
amounts to allowing some vertices to be odd. We write, for example,

(8)    $\frac{\partial^2 Z}{\partial h_a \partial h_b} = 2^{N(G)} (\cosh \beta J)^{n(G)} \sum_{g \in E(G,a,b)} v^{n(g)}$

where E(G, a, b) are the subgraphs with all vertices even, when the effective
number of edges coming in to vertices a and b are both incremented by one
(note that E(G, a, a) is the same as E(G)). Then,

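(9)    $\langle \sigma_a \sigma_b \rangle = \frac{1}{Z} \frac{\partial^2 Z}{\partial h_a \partial h_b} = \frac{\sum_{g \in E(G,a,b)} v^{n(g)}}{\sum_{g \in E(G)} v^{n(g)}}$

and still higher-order (connected) correlations can be computed as

(10)    $\langle \sigma_a \sigma_b \cdots \rangle_c = \frac{\partial^n \ln Z}{\partial h_a \partial h_b \cdots}$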
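For a graph small enough to enumerate, the subgraph expansion can be compared against a direct sum over spin configurations. The sketch below does this for a four-vertex test graph chosen purely for illustration; `rooted_sum`, `partition_function_dcm`, and `brute_force` are hypothetical helper names, and the external fields are set to zero throughout.

```python
import math
from itertools import product


def rooted_sum(edges, n_vertices, v, roots=()):
    """Sum of v**n(g) over edge subsets g whose odd vertices are the roots.

    A root listed twice cancels, so roots=(a, a) gives the even-subgraph sum.
    """
    wanted = [0] * n_vertices
    for r in roots:
        wanted[r] ^= 1
    total = 0.0
    for mask in range(1 << len(edges)):          # every subset of edges
        parity = [0] * n_vertices
        size = 0
        for k, (a, b) in enumerate(edges):
            if mask >> k & 1:
                parity[a] ^= 1
                parity[b] ^= 1
                size += 1
        if parity == wanted:
            total += v ** size
    return total


def partition_function_dcm(edges, n_vertices, beta_j):
    """Z from the even-subgraph expansion."""
    v = math.tanh(beta_j)
    prefactor = 2 ** n_vertices * math.cosh(beta_j) ** len(edges)
    return prefactor * rooted_sum(edges, n_vertices, v)


def brute_force(edges, n_vertices, beta_j, a, b):
    """Z and <s_a s_b> from an explicit sum over all spin configurations."""
    z = weighted = 0.0
    for s in product((-1, 1), repeat=n_vertices):
        w = math.exp(beta_j * sum(s[i] * s[j] for i, j in edges))
        z += w
        weighted += s[a] * s[b] * w
    return z, weighted / z


SQUARE_WITH_DIAGONAL = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
beta_j = 0.7
v = math.tanh(beta_j)
z_exact, corr_exact = brute_force(SQUARE_WITH_DIAGONAL, 4, beta_j, 1, 3)
z_dcm = partition_function_dcm(SQUARE_WITH_DIAGONAL, 4, beta_j)
corr_dcm = rooted_sum(SQUARE_WITH_DIAGONAL, 4, v, (1, 3)) / rooted_sum(
    SQUARE_WITH_DIAGONAL, 4, v
)
print(z_exact, z_dcm, corr_exact, corr_dcm)
```

Both routes cost time exponential in the size of the graph, so a check like this is practical only at the motif scale.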



Direct enumeration of all possible disconnected subgraphs rapidly becomes
prohibitive, since computation time is exponential in the number of edges.
For the motifs, however, with small n(M) (less than 10, e.g.), the
computation can be done on a modern desktop machine. Our general method
will be to compose the partition function for S(q, M) from the partition
function for S(q - 1, M).

7.1. Branching Networks in the Ising Model. Determining the partition
function for the branching assembly rule is reasonably straightforward.
It is aided by the tree-like hierarchy that arises as the graph is built up; all
disconnected, even graphs at any stage can be decomposed into the union
of the set of disconnected graphs on the N(M) subgraphs S(q - 1, M) and
the disconnected graphs on the additional motif M that now forms the
"highest-level" of the network.



(11)    $Z_q = 2^{N(M)} (\cosh \beta J)^{n(M)} Z_{q-1}^{N(M)} \sum_{m \in E(M)} v^{n(m)}$

The free energy per vertex, $\ln Z_q / N^{q}$, is a slowly decreasing function of q.
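The decomposition of even subgraphs described above can be checked by enumeration on the smallest case, a triangle motif assembled once. The sketch below counts even subgraphs by number of edges, ignoring all prefactors: under the branching rule the counts are those of $(1 + v^3)^4$, one factor for each of the three copies and one for the top-level motif, whereas under the nested rule larger loops appear. The helper names are illustrative assumptions.

```python
from collections import Counter

TRIANGLE = [(0, 1), (1, 2), (0, 2)]


def two_level(motif_edges, n_motif, rule):
    """Edges of the motif assembled once; vertex (a, i) is node a in copy i."""
    edges = [((a, i), (b, i)) for i in range(n_motif) for a, b in motif_edges]
    for i, j in motif_edges:
        a, b = (i, j) if rule == "branching" else (j, i)
        edges.append(((a, i), (b, j)))
    return edges


def even_subgraph_polynomial(edges):
    """Map k -> number of even subgraphs with k edges (coefficient of v**k)."""
    coefficients = Counter()
    for mask in range(1 << len(edges)):
        chosen = [e for k, e in enumerate(edges) if mask >> k & 1]
        degree = Counter(v for e in chosen for v in e)
        if all(d % 2 == 0 for d in degree.values()):
            coefficients[len(chosen)] += 1
    return dict(sorted(coefficients.items()))


branching_poly = even_subgraph_polynomial(two_level(TRIANGLE, 3, "branching"))
nested_poly = even_subgraph_polynomial(two_level(TRIANGLE, 3, "nested"))
print(branching_poly)
print(nested_poly)
```

Both networks have sixteen even subgraphs in total, but they are distributed differently over edge counts, which is where the two partition functions part ways.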

Computing the correlation function of such a system is again aided by
the tree-like hierarchy. At stage q, copies of the S(q - 1, M) graph are
placed at the N(M) locations. A vertex A on one of those copies can then
be referenced by a string of q numbers {a_1, ..., a_q}, where a_j is the vertex
number of M into which the S(q - 1, M) graph containing A is placed.

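Consider, to begin with, the correlation function between vertex A, {a_1, a_2},
and B, {b_1, b_2}, in S(2, M). When the roots are found on different subgraphs
(i.e., a_2 ≠ b_2),

(12)    $\langle \sigma_A \sigma_B \rangle = \frac{1}{Z} \frac{\partial^2 Z}{\partial h_A \partial h_B} \propto P_{1,a_1 a_2} \, P_{1,a_2 b_2} \, P_{1,b_2 b_1} \, Z_1^{N(M)-2}$,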

where P_{1,ab} is the sum of all graphs on M even in all vertices except at
vertices a and b, which are odd ("subgraphs of M rooted at a and b"):

(13)    $P_{1,ab} = \sum_{m \in E(M,\{a,b\})} v^{n(m)}$

In words, the path from A to B requires leaving the subgraph containing
A at a_2, crossing M, and entering the subgraph containing B at b_2. The
additional factors of Z_1, the partition function on M, come from the other
subgraphs that, if they are entered, must be left from the same vertex.
The generalization to n roots is straightforward.
The general form for P can be written

(14)    $P_{q,\{a\}\{b\}} = P_{q-1,\{a_1 \ldots a_{q-1}\}\{a_q \ldots a_q\}} \, P_{1,a_q b_q} \, P_{q-1,\{b_q \ldots b_q\}\{b_1 \ldots b_{q-1}\}} \, Z_{q-1}^{N(M)-2}$

or, in words, that one must get to the most connected node on one's sub- 
graph, and from there travel over the highest-level M to the most connected 
node of the destination subgraph. 
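For the triangle motif this product structure is easy to verify numerically. In the sketch below each leg of the journey contributes the motif-level correlation $(v + v^2)/(1 + v^3)$, or 1 if the leg starts and ends on the same vertex, and the product is compared with a brute-force sum over the 512 spin configurations of the two-level branching network. Function names and the vertex labelling `(a_1, a_2)` are assumptions made for this illustration.

```python
import math
from itertools import product

TRIANGLE = [(0, 1), (1, 2), (0, 2)]


def branching_two_level(motif_edges, n_motif):
    """Vertex (a1, a2) is node a1 of the copy placed at a2; hubs are (i, i)."""
    edges = [((a, i), (b, i)) for i in range(n_motif) for a, b in motif_edges]
    edges += [((i, i), (j, j)) for i, j in motif_edges]
    return edges


def brute_force_correlation(edges, vertex_a, vertex_b, beta_j):
    """<s_A s_B> from an explicit sum over all spin configurations."""
    vertices = sorted({v for e in edges for v in e})
    index = {v: k for k, v in enumerate(vertices)}
    pairs = [(index[a], index[b]) for a, b in edges]
    ia, ib = index[vertex_a], index[vertex_b]
    z = weighted = 0.0
    for s in product((-1, 1), repeat=len(vertices)):
        w = math.exp(beta_j * sum(s[i] * s[j] for i, j in pairs))
        z += w
        weighted += s[ia] * s[ib] * w
    return weighted / z


def motif_correlation(a, b, v):
    """Rooted sum over even sum on a single triangle: P_{1,ab} / P_{1,aa}."""
    return 1.0 if a == b else (v + v ** 2) / (1 + v ** 3)


def product_formula(vertex_a, vertex_b, v):
    """Leave A's copy at its hub, cross M, enter B's copy at its hub."""
    (a1, a2), (b1, b2) = vertex_a, vertex_b
    if a2 == b2:
        raise ValueError("A and B must lie on different copies")
    return (
        motif_correlation(a1, a2, v)
        * motif_correlation(a2, b2, v)
        * motif_correlation(b2, b1, v)
    )


beta_j = 0.5
edges = branching_two_level(TRIANGLE, 3)
for A, B in [((1, 0), (2, 1)), ((0, 0), (1, 1)), ((1, 0), (1, 1))]:
    exact = brute_force_correlation(edges, A, B, beta_j)
    print(A, B, exact, product_formula(A, B, math.tanh(beta_j)))
```

The three pairs correspond to three, one, and two non-trivial legs respectively, which shows directly how correlations shrink with each single-vertex bottleneck that must be crossed.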

We consider two vertices {a} and {b} to be separated by a copy distance
d, where d is the number of subgraphs one must traverse to reach B from A
(formally, if a_d ≠ b_d but either d is the generation of the graph or
a_{d+1} = b_{d+1}). The correlation function has the form of an exponential
cutoff:

(15)    $\frac{1}{|V(d)|} \sum_{\{A,B\} \in V(d)} \langle \sigma_A \sigma_B \rangle \sim x_0^{d}$,

where V(d) is the set of all vertex pairs separated by copy distance d, and
x_0 is the average correlation between different pairs in M.
In nesting, local interactions are increasingly less aware of the larger
structures in which they are embedded; as copy distance increases, correlations
die exponentially. Furthermore, the correlation between two vertices de- 
pends only on their relative positions in the hierarchy; the pair is insensitive 
to the extent of the rest of the graph. 

These effects are due to the way in which the subgraphs are wired
together; all interactions between different subgraphs pass through
single-vertex bottlenecks that restrict the number of paths. In the next section,
we shall see how the nested construction opens these bottlenecks - at the 
cost of larger graph diameters - and alters the critical behavior. 

7.2. Nested Networks and the General Form. The branching compu- 
tations were reasonably simple because of the absence of redundant paths, or 
loops, above the motif scale. (Formally, the difference - the set of unshared 
edges - between two paths decomposes into a union of even subgraphs on the 
motif M.) The absence of larger redundant paths has many implications in 
addition to how it affects the correlation functions; for example, connections 
between distant nodes may be cut by removal of a single vertex. 

The nested rule partition function appears harder to compute because of 
the existence of loops and redundant paths on all scales. However, a general 
algorithm for the computation of an arbitrary P_{q,{a},{b}} can be specified.
One decomposes the problem into two parts. One first considers how to 
traverse the "coarse-grained" graph, at the highest level; and then considers 
how to travel "within" each coarse-grained vertex to complete the path. 
The difference between nested and branching then amounts simply to which 
particular node address on subgraph A allows you to jump to subgraph B. 

More formally, P_{q,{a},{b}} is a sum over subgraphs on the motif M,
computed in the following way:

(1) At level q, one has a set of roots, {a_1, ..., a_q}, {b_1, ..., b_q}, ... .
Each of these roots corresponds to a root in one of the S(q - 1, M) copies.
For example, the copy number a_q has a root {a_1, ..., a_{q-1}}.

(2) Consider in turn each subgraph m in motif M (where m can be
disconnected or connected, odd or even).

(3) Each edge of that subgraph gives two additional roots, one associated
with each end of the edge. For example, an edge between nodes a_q
and b_q leads to two new roots, one for the copy a_q, and one for the
copy b_q.

(4) If the graph has been constructed by branching, the additional root
for the a_q copy is {a_q ... a_q} (a list q - 1 entries long).

(5) If the graph has been constructed by nested iteration, the additional
root for the a_q copy is {b_q ... b_q} (a list q - 1 entries long).

(6) Thus, for each S(q - 1, M) copy, we have a set of roots, r_j.

(7) Take the product of the N(M) terms P_{q-1,r_j}, and add together the
results for each subgraph m.

Note that ensuring the final path is even is deferred to the bottom level,
when P_{1,r_j} is computed.
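The steps above can be sketched as a short recursion. This is one reading of the algorithm, not the authors' implementation: levels are counted so that level 1 is the motif itself (matching the P_1 notation), each subgraph m is weighted by v raised to its number of edges, and roots are handled as lists whose parity is resolved only at the bottom level. The names `rooted_sum`, `build`, and `brute_force_correlation` are hypothetical. The result is checked against a brute-force spin sum on the two-level triangle networks.

```python
import math
from itertools import product

TRIANGLE = [(0, 1), (1, 2), (0, 2)]


def port_labels(i, j, rule):
    """f(i, j) and f(j, i): which internal label each end of edge (i, j) uses."""
    return (i, j) if rule == "branching" else (j, i)


def rooted_sum(level, roots, motif_edges, n_motif, v, rule):
    """Sum of v**(edge count) over subgraphs whose odd vertices are the roots."""
    total = 0.0
    for mask in range(1 << len(motif_edges)):            # step (2)
        m = [e for k, e in enumerate(motif_edges) if mask >> k & 1]
        if level == 1:                                   # parity enforced here
            parity = [0] * n_motif
            for (r,) in roots:
                parity[r] ^= 1
            for a, b in m:
                parity[a] ^= 1
                parity[b] ^= 1
            if not any(parity):
                total += v ** len(m)
            continue
        per_copy = [[r[:-1] for r in roots if r[-1] == c] for c in range(n_motif)]
        for i, j in m:                                   # steps (3)-(5)
            ki, kj = port_labels(i, j, rule)
            per_copy[i].append((ki,) * (level - 1))
            per_copy[j].append((kj,) * (level - 1))
        term = v ** len(m)
        for copy_roots in per_copy:                      # steps (6)-(7)
            term *= rooted_sum(level - 1, copy_roots, motif_edges, n_motif, v, rule)
        total += term
    return total


def build(levels, motif_edges, n_motif, rule):
    """Explicit edge list; addresses end with the top-level copy number."""
    edges = [((a,), (b,)) for a, b in motif_edges]
    for level in range(2, levels + 1):
        grown = [(a + (i,), b + (i,)) for i in range(n_motif) for a, b in edges]
        for i, j in motif_edges:
            ki, kj = port_labels(i, j, rule)
            grown.append(((ki,) * (level - 1) + (i,), (kj,) * (level - 1) + (j,)))
        edges = grown
    return edges


def brute_force_correlation(edges, vertex_a, vertex_b, beta_j):
    """<s_A s_B> from an explicit sum over all spin configurations."""
    vertices = sorted({x for e in edges for x in e})
    index = {x: k for k, x in enumerate(vertices)}
    pairs = [(index[a], index[b]) for a, b in edges]
    z = weighted = 0.0
    for s in product((-1, 1), repeat=len(vertices)):
        w = math.exp(beta_j * sum(s[i] * s[j] for i, j in pairs))
        z += w
        weighted += s[index[vertex_a]] * s[index[vertex_b]] * w
    return weighted / z


beta_j, A, B = 0.6, (1, 0), (2, 1)
v = math.tanh(beta_j)
for rule in ("branching", "nested"):
    ratio = rooted_sum(2, [A, B], TRIANGLE, 3, v, rule) / rooted_sum(2, [], TRIANGLE, 3, v, rule)
    print(rule, ratio, brute_force_correlation(build(2, TRIANGLE, 3, rule), A, B, beta_j))
```

The recursion visits every subgraph of the motif at each level, so its cost grows with the number of levels, whereas the brute-force comparison grows exponentially in the number of vertices and is feasible only for the smallest networks.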
Eq. [7] is the basis of the direct configurational method; some examples of
this expression for small graphs can be found in Ref. [109]. While direct
enumeration of graphs much larger than 30 bonds is impossible, the methods
described in the text make it possible to build up much larger graphs with
branching and nested properties of interest. With these equations, and
Eqs. [11] and [14], an arbitrary hierarchy may be constructed, since there is
no restriction on the form of P_{q-1}.

We have checked the central formulae explicitly through subgraph
enumeration on Fig. [2]; the results for three iterations we have checked
through seventh order in β, and thus in v, by an unrenormalized
linked-cluster expansion, using Feynman diagrams in the standard fashion
[100, 110].